Block Algorithms for Sparse Matrix by Dense Matrix Multiplication
Authors
Abstract
Sparse matrix computations appear in many linear algebra kernels of scientific applications. The study, evaluation and optimization of sparse matrix codes are more complex than in the dense case. Moreover, the irregularity of some memory accesses and the lack of a priori knowledge of the number of iterations to be performed in some loops (both of which depend on the sparsity pattern) limit the success of programmer and compiler optimizations. In this report we consider some specific examples in which we apply the evaluation techniques discussed in previous reports. We determine the performance of forms without blocking and show the improvement that can be obtained by using two levels of blocking (at the register and cache levels). Specifically, we consider the matrix multiplication C = C + A*B, where matrix A, of size N×N, is sparse and stored in row-wise ordered format, and matrices B and C, of size N×P, are dense and stored by columns, as in Fortran. Sparse matrix by dense matrix multiplications (SpMxM) appear, for instance, in iterative methods for obtaining the eigenvalues and eigenvectors of a sparse matrix.
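The operation described in the abstract can be illustrated with a minimal sketch. It assumes that "row-wise ordered format" corresponds to a compressed sparse row (CSR) layout, and it stores B and C as flat column-major lists to mirror Fortran. The names row_ptr, col_idx, vals and col_block are hypothetical and do not come from the report. The second function shows only one plausible form of blocking (reusing each nonzero of A across a small group of columns of B and C, in the spirit of register-level blocking); it is not necessarily the scheme the authors evaluate, and cache-level blocking is not shown. Pure Python is used for readability, not performance.

```python
def spmm_csr(n, p, row_ptr, col_idx, vals, B, C):
    """Unblocked C = C + A*B.

    A is n x n in CSR form (row_ptr, col_idx, vals).
    B and C are n x p, stored column-major as flat lists:
    element (i, j) lives at index i + j*n.
    """
    for j in range(p):                      # one column of B and C at a time
        for i in range(n):                  # one row of A at a time
            acc = C[i + j * n]
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += vals[k] * B[col_idx[k] + j * n]
            C[i + j * n] = acc


def spmm_csr_blocked(n, p, row_ptr, col_idx, vals, B, C, col_block=2):
    """Same result as spmm_csr, but each nonzero of A is loaded once
    and reused for col_block columns of B and C (assumed block size)."""
    for j0 in range(0, p, col_block):
        cols = range(j0, min(j0 + col_block, p))
        for i in range(n):
            # Small accumulator array standing in for registers.
            acc = [C[i + j * n] for j in cols]
            for k in range(row_ptr[i], row_ptr[i + 1]):
                a, c = vals[k], col_idx[k]
                for t, j in enumerate(cols):
                    acc[t] += a * B[c + j * n]
            for t, j in enumerate(cols):
                C[i + j * n] = acc[t]


# Illustrative data (made up for this sketch):
# A = [[2, 0, 1],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [2, 1, 3, 4, 5]
# B has columns (1, 2, 3) and (4, 5, 6), stored column-major.
B = [1, 2, 3, 4, 5, 6]

C = [0] * 6
spmm_csr_blocked(3, 2, row_ptr, col_idx, vals, B, C)
print(C)
```

Both functions traverse A row by row, so the accesses to B are driven by col_idx and are irregular, and the inner loop length depends on the number of nonzeros in each row. These are the two properties the abstract identifies as obstacles to optimization.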
Similar resources
Cache Oblivious Dense and Sparse Matrix Multiplication Based on Peano Curves
Cache oblivious algorithms are designed to benefit from any existing cache hierarchy—regardless of cache size or architecture. In matrix computations, cache oblivious approaches are usually obtained from block-recursive approaches. In this article, we extend an existing cache oblivious approach for matrix operations, which is based on Peano space-filling curves, for multiplication of sparse and...
Task-Based Algorithm for Matrix Multiplication: A Step Towards Block-Sparse Tensor Computing
Distributed-memory matrix multiplication (MM) is a key element of algorithms in many domains (machine learning, quantum physics). Conventional algorithms for dense MM rely on regular/uniform data decomposition to ensure load balance. These traits conflict with the irregular structure (block-sparse or rank-sparse within blocks) that is increasingly relevant for fast methods in quantum physics. T...
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; a method that can run on all structures, especially the Fibonacci Hypercube structure, is therefore necessary for parallel matr...
A SIMD Sparse Matrix-Vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models
Kipadia, Nirav Harish. M.S.E.E., Purdue University. May 1994. A SIMD Sparse Matrix-Vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models. Major Professor: Jose Fortes. A large number of problems in numerical analysis require the multiplication of a sparse matrix by a vector. In spite of the large amount of fine-grained parallelism available in the proc...
The I/O Complexity of Sparse Matrix Dense Matrix Multiplication
We consider the multiplication of a sparse N × N matrix A with a dense N × N matrix B in the I/O model. We determine the worst-case non-uniform complexity of this task up to a constant factor for all meaningful choices of the parameters N (dimension of the matrices), k (average number of non-zero entries per column or row in A, i.e., there are in total kN non-zero entries), M (main memory size), ...
Publication date: 1994